Abstract. Relational databases (DBs) are ideal tools to
manage bulky and structured data archives. In particu- 
lar for Astronomy they can be used to fulfill all the re- 
quirements of a complex project, i.e. the management of: 
documents, software (s/w) packages and logs, observation 
schedules, object catalogues, quick-look, simulated, raw 
and processed data, etc. All the information gathered in a 
relational DB is easily and simultaneously accessible either 
from an interactive tool or a batch program. The user does 
not need to deal with traditional files I/O or editing, but 
has only to build the appropriate (SQL) query which will 
return the desired information/data, possibly producing
the aforementioned files or even plots, tables, etc. in a
variety of formats. What is then important for a generic 
user is to have the tools to easily and quickly develop, in 
any desired programming language, the custom s/w which 
can import/export the information into/from the DB. An 
example could be a Web interface which presents the avail- 
able data and allows the user to select/retrieve (or even 
process) the data subset of interest. In recent years we
have been implementing a package called MCS (see the dedicated
paper in these proceedings) which allows users to interact
with MySQL-based DBs through any programming
language. MCS has a multi-threaded (socket) architecture,
which means that several clients can submit queries to
a server which in turn manages the communication with
the MySQL server and other MCS servers. Here we focus
on the real-world case of the robotic IR-optical telescope
REM (located at La Silla, Chile), which performs real-time
image acquisition, processing and archiving using
some of the MCS capabilities. Interested readers can
visit ross.iasfbo.inaf.it to get an idea of the potential of
DB-based data management.



1. Introduction 

Nowadays medium-to-large astronomical projects have
to manage a large amount of information
and data. Typically dedicated data centres manage
the collection of raw and pre-processed data and
subsequently make them accessible to the (authorized) users.



Access is performed either via (s)ftp or http(s) (Web) and
typically involves only file transfer. The selection of the
data of interest is usually performed by acting on a few parameters
(e.g. object name or coordinates). In a few cases,
when large amounts of data are involved, no (or little)
data transfer is allowed but the user can submit batch
jobs that return the results of a particular analysis. In
other, less common, cases the data are delivered to the
user on tapes, DVDs, etc. In all cases data acquisition,
archiving, delivery, processing and the accessibility of the results
are managed separately. Often the information is not
collected into relational database tables and, when it
is, the delay between the date of collection and the
archiving is of the order of days or even months. The same
happens for the data production logging and project documentation.
Luckily the use, in many cases, of standard
file formats like FITS can help to track the data origin
and processing status.

International projects like the GRID-based ones (see
e.g. www.grid.org, www.coregrid.net, grid.infn.it,
omii-europe.org, etc.) and the International Virtual Observatory
Alliance (IVOA - www.ivoa.net) represent an effort
to give a robust and standard framework for data archiving,
analysis and retrieval to physicists and astronomers.
However, the size and ambitions of these projects cause them to
proceed quite slowly, and potential users do not get immediate
advantages from them. Large ground- and space-based
Observatories usually put some effort into observation
bookkeeping and data accessibility for the users.
Small and medium projects/experiments instead tend to 
optimize the data management for their internal use only. 

Finally we note that the use of standard data formats
has allowed the development of standard analysis packages,
which can, when needed, be easily adapted to meet the
requirements of new projects.



2. Databases in astronomy 

The usage of databases to store data collected by astronomical
instruments/experiments is very common. Still, in
the majority of cases, they simply contain the information
about the collected data (date, object, wavelength,
etc.) and/or a list of objects with their observed and derived
characteristics (catalogues). In a few cases some
level of remote processing is permitted (see e.g. ASDC
- www.asdc.asi.it). Moreover, access to this information
is permitted only via Web browsers (or http client emulators)
or via dedicated programs which typically also
require the data to be on the same machine where the
program runs. Even when very advanced database systems
have been implemented, like the one used by the SDSS
project (www.sdss.org) which allows direct access (again
via http) with user-built SQL queries, a "direct" communication
between the user program and the database
system is not allowed. One needs, for example, to submit
the query, collect the output into an ASCII file and then
perform all the other desired analyses on one's own machine.

See FITS Web page: fits.gsfc.nasa.gov

The Virtual Observatory project is aimed at removing
the obstacles users have in finding and accessing the
data, (cross)processing them and finally retrieving the results,
whatever they are: images, plots, tables, etc. Still, it
does not foresee a "low-level" user interaction.

But why is it so important to make extensive use of
databases in Astronomy? Here is a short list of answers.
With databases we:

— can track in an ordered form what a project produces
and let the rest of the world know it;

— can manage all the information aspects of a project
within a single framework;

— don't need to worry about data management but can
concentrate on the analysis and interpretation;

— make data accessibility uniform for all the Observatories/Projects
from any computer on the Internet.

3. Our proposed system: MCS 

As mentioned above, the basic idea is that it is easier and
more efficient to use databases for almost all the aspects
related to a modern experiment/project in Astronomy.
Archives with documents, s/w packages, data logs, observing
schedules, object catalogues, simulated data, quick-look
graphs, raw and processed data can all be managed
by a modern database server without caring about computer
architecture, programming language, access security
or even data sharing, backup and restore.

What do we propose? A customizable system with user
management and multi-threading capabilities, allowing
inter-process messaging and file transfer, and DB input/output
from any internet node using any programming
language. Database insert/select queries can
be performed by mapping the data into parameter arrays
or structures (generally seen as tables with columns
of different types) or files of various types, including FITS
and VOTable. Selected data can be filtered through s/w components
producing graphs in vector (e.g. Postscript) or
bitmap (e.g. GIF) format, and so on.
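
The mapping between data structures and insert/select queries can be illustrated with a minimal sketch. It does not use the MCS interface, which is not described in this paper: Python's standard sqlite3 module stands in for a MySQL server, and the table obs_log, the ObsRecord fields and the helper functions are hypothetical names chosen for illustration only.

```python
import sqlite3
from dataclasses import dataclass, astuple, fields


@dataclass
class ObsRecord:
    """Hypothetical observation-log row: a structure with typed columns."""
    obs_id: int
    target: str
    filter_name: str
    exposure_s: float


def create_table(conn):
    # sqlite3 is only a stand-in for the MySQL server used by the authors.
    conn.execute(
        "CREATE TABLE obs_log (obs_id INTEGER PRIMARY KEY, target TEXT, "
        "filter_name TEXT, exposure_s REAL)"
    )


def insert_records(conn, records):
    """Map each structure onto one parameterised INSERT."""
    cols = [f.name for f in fields(ObsRecord)]
    sql = "INSERT INTO obs_log ({}) VALUES ({})".format(
        ", ".join(cols), ", ".join("?" for _ in cols)
    )
    conn.executemany(sql, [astuple(r) for r in records])


def select_records(conn, target):
    """Map the rows returned by a SELECT back onto structures."""
    cols = ", ".join(f.name for f in fields(ObsRecord))
    cur = conn.execute(
        f"SELECT {cols} FROM obs_log WHERE target = ? ORDER BY obs_id",
        (target,),
    )
    return [ObsRecord(*row) for row in cur]


conn = sqlite3.connect(":memory:")
create_table(conn)
insert_records(conn, [
    ObsRecord(1, "NGC 2375", "V", 60.0),
    ObsRecord(2, "NGC 2375", "R", 30.0),
    ObsRecord(3, "SA110", "V", 10.0),
])
rows = select_records(conn, "NGC 2375")
```

The client code never touches files: it fills structures, hands them to the database, and gets structures back from a query.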



Such a system would also be well suited to managing
experiments in real time. Health and data acquisition
status and automatic analysis results can be monitored
from any place on the internet. The main advantages for a
project collaboration would be: information that is easy to find,
ready-to-use pre-processed data, shared high-level processing
s/w (automatic or on demand), per-user backup
and restore, data access security and easy replication. And
again, users can have direct access to such a system by
using custom s/w or Web-based user interfaces. In other
terms, the common tasks are performed on the server side
whereas client s/w (running on the user computer) can
concentrate on specific analysis of the retrieved data.

3.1. The MCS library 

In recent years we have been developing a package which
meets all the above listed requirements: MCS (Calderone
& Nicastro, these proceedings). An MCS-based data management
system has the characteristics of a traditional DB-based
management system, with the addition of several
crucial advantages. It is flexible enough to allow users to
easily and quickly develop tools to manage observation
schedules and logs, real-time data archiving and processing.
It has a built-in user privilege system, SSL encryption
and automatic management of commonly used file
formats. In addition it allows users to easily distribute the
data processing among various machines and keep track
of the status via DB log tables. The MCS library has an
interactive shell and is interfaced to (almost) all
programming languages; this means that whereas an MCS
server has to be written in C++, any other DB-accessing
program can be written in any language. This permits
an easy integration of existing and newly developed s/w
within a collaboration where the participants most likely
don't use one single programming language.
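
The general pattern of several clients talking over sockets to a multi-threaded server, which alone talks to the database, can be sketched as follows. This is not the MCS protocol or API: the one-line request format, the JSON reply, the images table and the function names are all assumptions made for illustration, and sqlite3 replaces MySQL so that the example is self-contained.

```python
import json
import socket
import socketserver
import sqlite3
import threading

# One shared connection guarded by a lock (sqlite3 stands in for MySQL).
DB = sqlite3.connect(":memory:", check_same_thread=False)
DB_LOCK = threading.Lock()
DB.execute("CREATE TABLE images (id INTEGER, target TEXT)")
DB.executemany("INSERT INTO images VALUES (?, ?)",
               [(1, "NGC 2375"), (2, "SA110")])


class QueryHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread for every connected client."""

    def handle(self):
        # Hypothetical protocol: the client sends a target name on one line.
        target = self.rfile.readline().decode().strip()
        with DB_LOCK:
            rows = DB.execute(
                "SELECT id, target FROM images WHERE target = ?", (target,)
            ).fetchall()
        # The reply is one line of JSON.
        self.wfile.write((json.dumps(rows) + "\n").encode())


def query(host, port, target):
    """Minimal client: any language with sockets could do the same."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall((target + "\n").encode())
        with sock.makefile("r") as reply:
            return json.loads(reply.readline())


# Port 0 lets the operating system pick a free port.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), QueryHandler)
server.daemon_threads = True
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
result = query(host, port, "NGC 2375")
server.shutdown()
server.server_close()
```

Because the client side needs nothing more than a socket and a text format, it can be written in whatever language a collaborator prefers, which is the point made above.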

We have also started including user-contributed (MCS-based)
and external libraries in the various languages to
make it even easier to perform DB communication and typical
astronomical analysis/calculations like simple fitting,
sky mapping, coordinate and time conversions, and astrometric
calculations. These libraries include well known and
tested packages like:

— Hierarchical Triangular Mesh (HTM -
www.sdss.jhu.edu/htm/) used for object catalogue
indexing;

— Hierarchical Equal Area isoLatitude Pixelization 
(HEALPix - healpix.jpl.nasa.gov) used to produce sky 
maps; 

— Naval Observatory Vector Astrometry Subroutines
(NOVAS - aa.usno.navy.mil/software/novas/) used for
computing astrometric quantities and transformations.

All this comes in one single library which, in its simplest form, can
be compiled with one single dependency: the (freeware) MySQL
library. It is also worth noting that in the future we



L. Nicastro & G. Calderone: Concepts for astronomical data accessibility and analysis via relational database 






plan to support DB systems other than MySQL. Data input/output
can be performed in several standard formats
like XML, FITS and VOTable. The latter immediately
makes data and products accessible to a Virtual Observatory
(VO). Notably, communication between an MCS
server and a VO gives a real-time view of, and access
to, the Observatory products. In other words the Virtual
meets the Real. We plan to perform "real" tests in collaboration
with Institutions involved in the IVOA in the
near future.
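
As an illustration of output in VOTable format, the sketch below serialises the rows of a query result into a minimal VOTable document using only Python's standard library. It is not the MCS implementation: the function name, the column list and the choice of VOTable version 1.3 are assumptions, and optional metadata (units, UCDs, descriptions) is left out.

```python
import xml.etree.ElementTree as ET

VOTABLE_NS = "http://www.ivoa.net/xml/VOTable/v1.3"  # assumed version


def rows_to_votable(columns, rows):
    """Serialise rows as a minimal VOTable document.

    columns: list of (name, datatype) pairs, with VOTable datatypes
             such as "int", "double" or "char".
    rows:    list of tuples, one value per column.
    """
    votable = ET.Element("VOTABLE", version="1.3", xmlns=VOTABLE_NS)
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name="results")
    for name, datatype in columns:
        attrs = {"name": name, "datatype": datatype}
        if datatype == "char":
            attrs["arraysize"] = "*"  # variable-length string
        ET.SubElement(table, "FIELD", attrs)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


xml_text = rows_to_votable(
    [("name", "char"), ("ra", "double"), ("dec", "double")],
    [("obj1", 101.0, -30.0), ("obj2", 101.01, -30.02)],
)
```

The same rows could equally be written to a FITS binary table; only the final serialisation step changes.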



4. REM ROSS 



REM (Rapid Eye Mount, Covino et al. 2004) is a robotic
telescope equipped with IR (REMIR) and optical (ROSS,
Tosti et al. 2004) cameras aimed mainly at catching
GRB afterglows as fast as possible. It is also used to monitor
variable objects and to perform ToO observations of
other interesting objects. ROSS can produce direct or dispersed
(via an Amici prism) optical images. Observation
logging and real-time image processing/archiving are performed
by accessing local and remote DBs. As soon as an
image is (pre)processed, it is available to the owner in the
database. As usual it is accessible from any internet node.
A web interface (written in PHP) allows simple and fast
access to the log and products (images and spectra). Each
user has his/her own account and can access only proprietary
data, whereas the observation log is freely accessible.
It is very easy to implement new facilities performing more
tasks on images or spectra.

Moreover all the REM project documents, papers,
pictures, etc. are stored in DB tables and are accessible
through dynamic web pages written in PHP. In addition
people have a web-accessible "work area" repository
useful to exchange any kind of file. The REM
observation scheduling and status information system 
(see ross.iasfbo.inaf.it/~rem/), which was initially implemented
to work with ASCII files rather than with DB
tables, will soon also work in the MCS environment.



4.1. HTM indexed catalogues

In order to quickly access IR/optical object catalogues
and so identify newly discovered objects in the observed
fields, we have ported many of them into DB tables. The
only relevant difference with respect to the original ones is
that they are all indexed with the HTM scheme, which
in turn allows a natural DB indexing of the tables. Typically
a query to a one-billion-object catalogue like the
GSC 2.3, on a 10 x 10 arcmin area (which is the REM field
of view), takes ~ 20 ms. Thanks to MCS, these catalogues
can be queried by any (authorized) internet user directly
with their own program, written in any language. A standalone
program (written in C++) is available to perform
simple select queries and get the result in various formats.
Moreover a web interface (written in PHP) allows users to
perform interactive queries with graphical visualization of
the selected objects. All the catalogues are accessible at
ross.iasfbo.inaf.it.

Fig. 1. Schematic diagram of the ROSS observing s/w.
Access to the custom MySQL all-sky catalogues is performed
when searching for UFOs.
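
Why a spatial index makes small-field queries on huge catalogues fast can be shown with a simplified sketch. Note that it does not implement HTM: a much simpler declination-zone index is swapped in, where each object stores an integer zone number that the database indexes together with right ascension. The table and column names, the zone height and the use of sqlite3 in place of MySQL are all assumptions, and RA wrap-around at 0/360 degrees and the poles are ignored.

```python
import math
import random
import sqlite3

ZONE_DEG = 0.5  # zone height in degrees (assumed value)


def zone_of(dec):
    """Integer index of the declination strip containing dec (degrees)."""
    return int(math.floor((dec + 90.0) / ZONE_DEG))


def build_catalogue(conn, objects):
    """Store (ra, dec) pairs with their zone and index on (zone, ra)."""
    conn.execute("CREATE TABLE cat (ra REAL, dec REAL, zone INTEGER)")
    conn.executemany(
        "INSERT INTO cat VALUES (?, ?, ?)",
        [(ra, dec, zone_of(dec)) for ra, dec in objects],
    )
    conn.execute("CREATE INDEX cat_zone_ra ON cat (zone, ra)")


def box_query(conn, ra0, dec0, size_arcmin=10.0):
    """Objects in a size x size arcmin box centred on (ra0, dec0)."""
    half = size_arcmin / 60.0 / 2.0
    dec_min, dec_max = dec0 - half, dec0 + half
    # Widen the RA range so the box keeps its angular width on the sky.
    half_ra = half / math.cos(math.radians(dec0))
    cur = conn.execute(
        "SELECT ra, dec FROM cat "
        "WHERE zone BETWEEN ? AND ? AND ra BETWEEN ? AND ? "
        "AND dec BETWEEN ? AND ?",
        (zone_of(dec_min), zone_of(dec_max),
         ra0 - half_ra, ra0 + half_ra, dec_min, dec_max),
    )
    return sorted(cur.fetchall())


# Small synthetic catalogue (random positions, fixed seed).
rng = random.Random(1)
objects = [(rng.uniform(100.0, 102.0), rng.uniform(-31.0, -29.0))
           for _ in range(5000)]
conn = sqlite3.connect(":memory:")
build_catalogue(conn, objects)
found = box_query(conn, 101.0, -30.0)
```

The indexed integer column lets the database discard almost the whole table before comparing coordinates. HTM achieves the same effect with a hierarchical triangular subdivision of the sphere, which also behaves well near the poles.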

4.2. The ROSS images manager

The ROSS camera manager is in charge of setting the observation
parameters and collecting the images
as FITS files. Another s/w component (RossOPipe) manages
the extraction of objects and their matching with the list of
objects present in the reference (or other) catalogues (e.g.
GSC 2.3), in order to check for the presence of new
objects (we call them UFOs). A schematic flow chart is
shown in Fig. 1. Finally all the relevant information is
collected in DB tables and made immediately available to
the user, who can view it either from a web interface
or via custom programs.
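
The matching step described above can be sketched as a positional cross-match: a detection with no catalogue object within a given radius is flagged as a candidate new object. This is not the RossOPipe code, whose algorithm is not given here; the function names and the 2 arcsec matching radius are assumptions, and the brute-force loop is only suitable for the small object lists of a single field.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2)
         * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0


def find_unmatched(detections, catalogue, radius_arcsec=2.0):
    """Return detections with no catalogue object within the radius."""
    return [
        d for d in detections
        if all(angular_sep_arcsec(d[0], d[1], c[0], c[1]) > radius_arcsec
               for c in catalogue)
    ]


# Reference objects for the field, e.g. returned by a catalogue query.
catalogue = [(101.0, -30.0), (101.01, -30.02)]
# Objects extracted from a new image (RA, Dec in degrees).
detections = [(101.0001, -30.0001), (101.01, -30.02), (101.05, -29.98)]
ufos = find_unmatched(detections, catalogue)
```

In this toy field the first two detections fall within the matching radius of a catalogue object, so only the third one is reported.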

4.3. The Web based data access

The observation log, images and spectral data can be
browsed, (partially) processed and retrieved in real time
via a web interface written in PHP. Again, thanks to MCS,
the same tasks can be performed using any other language,
though interpreted languages like PHP or Python
are more suitable for web page creation. A centralized
archiving/processing system with easy access guarantees:

— minimum disk space occupancy (in general only post-processing
results need to be transferred to the user
machine);

— easy backup/restore;

— easy s/w maintenance.

Fig. 2. The ROSS images browser uses a simple Web interface.
Getting information about images and detected
objects requires a few clicks. Here a V-band observation of the
AGN NGC 2375 is shown.

Fig. 3. To view the objects present in the various catalogues
(converted into MySQL tables), the Perl tool myCatChart is used.
Here the USNO B1.0 objects in a 10' x 10' region around
NGC 2375 are shown. It's on the Galactic plane!

A record-level privilege system allows a selective view
of the data in the database table. Each user can only view
and access owned images together with the calibration
files and the observation log. Selecting a sub-sample of
images and browsing/viewing them on the Web interface
is very simple (see Fig. 2). Getting the list of
objects in the image with their photometric and astrometric
characteristics also requires one click, and one more click
shows the sky chart of the objects listed in various all-sky
catalogues (see Fig. 3). For the automatic spectral data
analysis it is enough to click on the spectra to get them
plotted and have the corresponding FITS binary tables in
counts or flux units ready to be viewed/downloaded (see
Fig. 4). Note that all these operations are performed in
real time.

Fig. 4. Browsing a spectral image and getting the plots
and FITS files could not be easier: just click on the
image/spectrum.
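
The effect of a record-level privilege system can be sketched with a query that only returns the rows a given user is entitled to see: owned images plus calibration files. How MCS enforces this on the server is not described here, so the table layout, the column names and the helper function are hypothetical, and sqlite3 again stands in for MySQL.

```python
import sqlite3


def visible_images(conn, user):
    """File names the user may see: owned images plus calibration files."""
    cur = conn.execute(
        "SELECT filename FROM images "
        "WHERE owner = ? OR kind = 'calib' ORDER BY filename",
        (user,),
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (filename TEXT, owner TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [
        ("img_a1.fits", "alice", "science"),
        ("img_b1.fits", "bob", "science"),
        ("flat_v.fits", "rem", "calib"),
    ],
)
alice_view = visible_images(conn, "alice")
bob_view = visible_images(conn, "bob")
```

Filtering on the server side means the same table serves every user, while each client only ever receives its own records.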

5. Conclusions

We have proposed a new approach to the management of
the (nowadays) huge amount of data/information that modern
astronomical experiments produce. In particular we have
proposed the usage of a single package (MCS, Calderone
& Nicastro, this volume) which allows users to manage
all the aspects of a project, all built over a relational
database system. Such data would then be more effectively
exploitable by the astronomical community at large,
for example for multi-wavelength studies and for access
from the various Virtual Observatories.

We welcome any interested group or single researcher
willing to contribute to any aspect of this project.



